ABSTRACT



We discuss how current and future data on the clustering and number density 
of z ~ 3 Lyman-break galaxies (LBGs) can be used to constrain their relationship 
to dark matter haloes. We explore a three-parameter model in which the number of
LBGs per dark halo scales like a power law in the halo mass: $N(M) = (M/M_1)^S$ for
$M > M_{\rm min}$. Here, $M_{\rm min}$ is the minimum mass halo that can host an LBG, $M_1$ is
a normalization parameter, associated with the mass above which haloes host more
than one observed LBG, and $S$ determines the strength of the mass dependence. We
show how these three parameters are constrained by three observable properties of
LBGs: the number density, the large-scale bias, and the fraction of objects in close
pairs. Given these three quantities, the three unknown model parameters may be
estimated analytically, allowing a full exploration of parameter space. As an example,
we assume a $\Lambda$CDM cosmology and consider the observed properties of a recent sample
of spectroscopically confirmed LBGs. We find that the favored range for our model
parameters is $M_{\rm min} \simeq (0.4 - 8) \times 10^{10}\,h^{-1}M_\odot$, $M_1 \simeq (6 - 10) \times 10^{12}\,h^{-1}M_\odot$, and
$0.9 \lesssim S \lesssim 1.1$. The preferred region in $M_{\rm min}$ expands by an order of magnitude and
slightly shallower slopes are acceptable if the allowed range of $b_g$ is permitted to span
all recent observational estimates. We also discuss how the observed clustering of LBGs
as a function of luminosity can be used to constrain halo occupation, although due
to current observational uncertainties we are unable to reach any strong conclusions.
Our methods and results can be used to constrain more realistic models that aim to
derive the occupation function $N(M)$ from first principles, and offer insight into how
basic physical properties affect the observed properties of LBGs.

Key words: cosmology: theory – galaxies: high-redshift – galaxies: haloes – galaxies: formation – dark matter



1 INTRODUCTION 



The Lyman-break color selection technique has made possible the compilation
of a large, fairly complete sample of $z \sim 3$ galaxies (Steidel et al. 1998;
Adelberger et al. 1998, hereafter A98; Adelberger 2001). This sample provides
robust estimates of the number densities and clustering properties of bright,
high-redshift galaxies, which can lead to invaluable constraints on models for
the evolution of structure in the Universe and high-z galaxy formation (A98;
Giavalisco et al. 1998; Steidel et al. 1999; Adelberger 2000; Giavalisco &
Dickinson 2001; Porciani & Giavalisco 2001).

In the CDM framework, given a power spectrum and a cosmology, the number
densities and clustering properties of dark matter haloes can be readily
estimated at any redshift, either by analytic methods or N-body simulations.
Relating galaxies to these dark haloes is significantly more challenging, since
the way in which galaxies populate haloes, both in number and in luminosity,
depends on aspects of galaxy formation that are as yet poorly understood, such
as the efficiency of star formation and feedback processes (see, e.g.,
Somerville & Primack 1999; Somerville, Primack & Faber 2001a; Wechsler et al.
2001, hereafter W01). However, once the cosmological model is specified, the
observed clustering properties of galaxies can potentially be used to constrain
the nature of galaxy assembly.

The current observational estimates of the number density and large-scale
clustering amplitude, or bias, for these $z \sim 3$ Lyman-break galaxies (LBGs)
are reasonably consistent with a model in which there is one galaxy in each halo
more massive than some threshold (Mo & Fukugita 1996; A98; Wechsler et al. 1998;
Jing & Suto 1998; Bagla 1998; Coles et al. 1998; Moscardini et al. 1998; Arnouts
et al. 1999; W01). In more detailed models of galaxy formation, however, the
association between haloes and galaxies is not expected to be this simple. For
example, even if the most luminous high-redshift galaxies are quiescently
star-forming objects that reside in massive haloes at $z \sim 3$, we expect that
these haloes will contain substructure composed of haloes that formed at earlier
epochs and have merged to become subhaloes of more massive hosts (e.g., Klypin
et al. 1999; Moore et al. 1999; Springel 1999; Bullock et al. 2001; W01). If
the high-redshift galaxies are merger-triggered starbursts, again we expect to
find multiple galaxies per halo, perhaps with different occupation statistics
than would be predicted in a scenario dominated by quiescent star formation
(Kolatt et al. 1999; Bullock et al. 1999; Somerville et al. 2001a; W01). It is
therefore useful to examine a more general scenario for populating haloes with
galaxies, and to explore ways of constraining the halo-galaxy relation directly.

In this paper, we focus on how the number densities and clustering properties
of LBGs can be used to constrain the $z \sim 3$ galaxy halo occupation function,
$N_g(M)$, which describes the typical number of observed galaxies within a halo
of mass $M$. In addition to using the number density and large-scale clustering
amplitude of high-z galaxies to constrain the model, we make use of a statistic
that reflects the small-scale clustering: the fraction of galaxies in close
pairs over narrow redshift bins (the close pair fraction). As an example of how
this can be applied, we use the number density, bias, and close pair statistics
derived from 802 LBGs from the spectroscopically confirmed sample of Adelberger
(2001) to derive constraints on the general nature of the halo occupation
function. Our framework is also applied to predict clustering trends as a
function of luminosity, and should prove useful for interpreting future
observations of high-z galaxies.



In some respects, this work extends that of Wechsler et al. (2000) and W01, in
which we used semi-analytic models of galaxy formation to predict $N_g(M)$ and
then calculated the clustering properties of LBGs using N-body simulations.
Here, we seek to constrain the halo occupation function directly, using analytic
approximations. We adopt a simple functional form for the number of galaxies per
halo as a function of halo mass:

$$N_g(M; M_1, M_{\rm min}, S) = \left(\frac{M}{M_1}\right)^S, \qquad M > M_{\rm min}. \qquad (1)$$

This relation, which is motivated by the more detailed semi-analytic modelling
mentioned above, has three free parameters: $M_{\rm min}$, the minimum mass halo
capable of hosting an observable LBG; $M_1$, a normalization parameter, which may
be interpreted as the critical mass above which haloes typically host more than
one observed galaxy; and $S$, the slope of the relation. In principle, any model
of galaxy formation that aims to explain LBG properties can predict the value of
each of these parameters (as long as the observations can be reasonably well
described as a power law over some mass range; if not, the approach discussed
here can easily be extended to more complicated functional descriptions).
Derived constraints on $M_{\rm min}$, $M_1$ and $S$ can serve as constraints on more
sophisticated models and ultimately as a probe of the underlying physics of
galaxy formation.
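
As a concrete illustration of the three-parameter form in Eqn. 1, the following minimal Python sketch evaluates the occupation function at a few halo masses. The function name `occupation`, the parameter values, and the choice to return zero at or below $M_{\rm min}$ are illustrative assumptions, not values or conventions taken from the analysis.

```python
def occupation(mass, m1, m_min, slope):
    """Mean number of observed LBGs in a halo of the given mass (Eqn. 1).

    Assumption: haloes at or below m_min host no observable LBG, so the
    function returns 0 there and (mass / m1) ** slope above the threshold.
    """
    if mass <= m_min:
        return 0.0
    return (mass / m1) ** slope


# Illustrative parameter values only (masses in h^-1 M_sun).
M1, M_MIN, S = 8.0e12, 1.0e10, 1.0

for m in (5.0e9, 1.0e11, 8.0e12, 1.6e13):
    print(f"M = {m:.1e}: N_g = {occupation(m, M1, M_MIN, S):.4f}")
```

With these inputs, haloes between $M_{\rm min}$ and $M_1$ host a fractional mean number of observed galaxies, which is why $M_1$ can be read as the mass above which a halo typically hosts more than one.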

Similar approaches, focusing mainly on the clustering properties of local
galaxies, have been used to explore the $z = 0$ halo occupation function (e.g.
Jing et al. 1998; Benson et al. 2000; Seljak 2000; Peacock & Smith 2000;
Scoccimarro et al. 2001; Benson 2001; Berlind & Weinberg 2001; Cooray 2001). Our
focus on halo occupation at high redshift is complementary to these local
explorations, since together they provide a potential probe of the evolution of
star formation and galaxy assembly. The expected clustering properties of
galaxies (and dark matter) in this type of model can, in principle, be
determined continuously over all relevant length scales using analytic methods
similar to those presented by, e.g., Scoccimarro et al. (2001). However, the
existing observational samples at $z \sim 3$ are currently too small to obtain
accurate estimates of the full correlation function and its moments. For this
reason, we focus on two measures of the clustering amplitude: one at scales
larger than the size (virial radius) of typical dark matter haloes, reflecting
the clustering properties of individual haloes, and one at small scales,
reflecting mainly the statistics of multiple galaxies within common dark matter
haloes.

In the next section (§2) we summarize the current observational determinations
at high redshift ($z \sim 3$) of the three main quantities used in our
investigation: the comoving number density, $n_g$; the large-scale bias, $b_g$
(which may be related to the correlation length $r_0$); and the number density
of close pairs, $n_{cp}$, which may also be expressed as the close pair fraction,
$f_{cp} = n_{cp}/n_g$. In the following section (§3), we outline our approach for
predicting these three quantities using our halo occupation model and analytic
approximations for the clustering properties of dark matter haloes. In §4 we use
the observed estimates for the three numbers $n_g$, $b_g$, and $f_{cp}$ to place
constraints on the three model parameters $M_1$, $M_{\rm min}$, and $S$. In §5 we use
our model to make predictions for clustering segregation with luminosity, and
discuss how current and future observations help place further constraints on
halo occupation models. We reserve §6 for discussion and conclusions. In all
calculations, we adopt a flat CDM model with a non-zero vacuum energy and the
following parameters: $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, $h = 0.7$, $\sigma_8 = 0.9$,
where $\sigma_8$ is the rms fluctuation on the scale of $8\,h^{-1}$ Mpc, $h$ is the
Hubble constant in units of 100 km s$^{-1}$ Mpc$^{-1}$, and $\Omega_m$ and
$\Omega_\Lambda$ are the density contributions of matter and the vacuum respectively
in units of the critical density.



2 OBSERVATIONAL QUANTITIES AND 
ASSOCIATED UNCERTAINTIES 

We focus this investigation on a relatively large sample of Lyman-break
galaxies, selected from a ground-based catalog of $U_n$, $G$ and $\mathcal{R}$
photometry, which is complete to $\mathcal{R} = 25.5$ (Steidel et al. 1998; A98;
Adelberger 2001). Spectroscopic follow-up has been performed for a subset of the
photometric candidates, leading to successful redshift identifications for about
45 per cent of the total sample of photometrically selected LBGs. All of the
galaxies with spectroscopic identifications have redshifts in the range
$2.2 \lesssim z \lesssim 3.8$ (median redshift $z \sim 3$). K. Adelberger has kindly
provided us with the data for 802 spectroscopically confirmed LBGs, which
consist of the 500 galaxies in the sample described in A98, plus 302 additional
galaxies. At the time of writing it comprises the largest and most complete
sample of this kind. We shall refer to this as the A01 sample. Recent analyses
of the clustering properties of subsets of these data have been presented by
A98, Giavalisco et al. (1998), Adelberger (2000), Giavalisco & Dickinson (2001,
hereafter GD01), and Porciani & Giavalisco (2001), and we shall also make use of
these results.

We choose statistics that can be calculated reasonably 
robustly from this sample, and which produce constraints on 
the three free parameters of our model. The statistics that 
we shall consider are: 

(i) the comoving galaxy number density, $n_g$

(ii) the large-scale galaxy bias, $b_g$

(iii) the close pair fraction, $f_{cp}$

Each of these statistics can be derived directly from the 
data with a small number of additional assumptions. We 
discuss the definitions of each of these quantities in more 
detail below. 



2.1 Number Density 

Consider a population of galaxies with a given magnitude limit and at a given
redshift, with a (comoving) volume density $n_{\rm true}$. Of course, no observed
sample of galaxies is perfectly complete, and so the observed density
$n_{\rm obs}$ differs from the underlying, true density by a factor
$p = n_{\rm obs}/n_{\rm true}$. Here, the observed number density is just the number
of galaxies actually observed per unit redshift and solid angle, $N_{\rm obs}$,
divided by $dV(z)$, the comoving volume element per unit redshift and solid
angle. In the case of the Lyman-break galaxies, which are pre-selected by color,
the observed population may differ from the underlying one at a given magnitude
limit for several reasons. Galaxies may be missing from the sample because of
confusion or blending with nearby sources, or because their observed colors lie
outside the selection window, either intrinsically or due to scattering by
photometric errors. In addition, spectroscopic follow-up is only attempted for
some fraction of the color-selected candidates, and not all of these are
successfully assigned redshifts, usually because of insufficient
signal-to-noise. At a given redshift, we can write the relationship between the
number of galaxies in the true and observed population as:

$$N_{\rm obs} = f_{\rm spec}\, f_{\rm phot}\, N_{\rm true} \qquad (2)$$

where $f_{\rm spec} = N_{\rm spec}/N_{\rm phot}$ is the fraction of photometric
candidates with successful redshift identifications, and
$f_{\rm phot} = N_{\rm phot}/N_{\rm true}$ is the fraction of the underlying population
selected by the color-color criteria. In principle, both of these terms may
depend on redshift. We can write the incompleteness of the photometric sample as
$f_{\rm phot} = f_{\rm peak}\,\phi(z)$, where $\phi(z) = N_{\rm obs}(z)/N_{\rm peak}$ is the
peak-normalized selection function, which is just the observed redshift
distribution normalized by the value at the peak.

If we make the simplifying assumption that the probability of obtaining a
successful redshift, $f_{\rm spec}$, does not depend on the redshift itself over
the relevant range, then we can write the overall selection probability $p$ as
the product of three parts:

$$p = f_{\rm spec}\, f_{\rm peak}\, \frac{V_{\rm eff}}{V_{\rm top}} \qquad (3)$$

Here, $V_{\rm top}$ is the volume per unit area integrated over the redshift range,
and $V_{\rm eff}$ is the selection-function-weighted volume per unit area:

$$V_{\rm eff} = \int \phi(z)\, dV(z)\, dz \qquad (4)$$

The selection function $\phi(z)$ may be constructed from the measured redshift
distribution averaged over many fields. This function is roughly Gaussian, with
a mean of $z \simeq 3$ and a width of $\sigma_z \simeq 0.24$ (see A98; Giavalisco et
al. 1998).



Thus two of the components of $p$ are well constrained observationally. The
fraction $f_{\rm spec}$ is trivially determined by relating the number of galaxies
in the spectroscopic sample to the original number of photometric candidates:
our sample of 802 galaxies was selected from a population of 1781 photometric
candidates, so in this case $f_{\rm spec} = 0.45$. What is uncertain is to what
degree the spectroscopic sample is biased towards objects of brighter magnitudes
(GD01); the expression for $p$ above assumed that $f_{\rm spec}$ was independent of
the magnitude of the candidate. If there is a strong bias toward brighter
galaxies affecting the completeness as a function of magnitude, then the
effective $f_{\rm spec}$ and $p$ values would be increased to get a lower value of
$n_{\rm true}$ for this brighter sample. The contribution to $p$ due to the
selection function is also well constrained observationally: integrating over
the selection function gives $V_{\rm eff}/V_{\rm top} = 0.52$. The factor $f_{\rm peak}$
is the most uncertain, but is probably in the range 0.5 to 1.0. Taken together,
favored values are in the range $p = 0.1$-$0.3$. In W01, we assumed a value of
$p = 0.14$.
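
The arithmetic behind the quoted range of $p$ can be checked with a short Python sketch that multiplies the three factors of Eqn. 3. The numbers are those quoted in the text; the function name `selection_probability` is an illustrative choice.

```python
# Factors quoted in the text.
n_spec, n_phot = 802, 1781      # spectroscopic sample / photometric candidates
f_spec = n_spec / n_phot        # approximately 0.45
volume_ratio = 0.52             # V_eff / V_top from the selection function


def selection_probability(f_peak):
    """Overall selection probability p = f_spec * f_peak * V_eff / V_top (Eqn. 3)."""
    return f_spec * f_peak * volume_ratio


for f_peak in (0.5, 1.0):       # quoted plausible range for f_peak
    print(f"f_peak = {f_peak:.1f} -> p = {selection_probability(f_peak):.3f}")
```

Both ends of the $f_{\rm peak}$ range give values of $p$ inside the favored interval 0.1-0.3.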

We calculate the observed number density $n_g$ directly by dividing the number
of galaxies in the A01 sample by the volume of the region subtended by the total
angular size of the survey (nine 9 arcmin$^2$ fields, one 6.5 arcmin$^2$ field,
and three 7 x 14 arcmin fields) over the redshift range $2.5 \le z \le 3.5$. The
implied observed number density is then
$n_g = 6.6 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$. The error on this number from cosmic
variance should be small (~ 3 per cent, based on resampling a large-volume
N-body simulation), so we neglect it for the purposes of normalizing our models.
Because of the remaining uncertainty in the value of the selection probability
$p$, we will work only with the observed number density. The constraints that we
obtain can then be translated back to the values relevant to the underlying,
intrinsic population if and when the value of $p$ is determined.
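
The order of magnitude of the observed number density can be reproduced with a rough Python sketch that integrates the comoving volume of the survey in the adopted flat cosmology. Two assumptions are made here and are not taken from the text: the field sizes are read as 9 x 9, 6.5 x 6.5 and 7 x 14 arcmin on a side, and all 802 galaxies are counted even though some lie outside $2.5 \le z \le 3.5$, so the result is only an upper bound on $n_g$.

```python
import math

OMEGA_M, OMEGA_L = 0.3, 0.7     # flat cosmology adopted in the text
HUBBLE_DIST = 2997.92458        # c / (100 km/s/Mpc), in h^-1 Mpc


def inv_e(z):
    """1 / E(z) for a flat universe with matter and vacuum energy."""
    return 1.0 / math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in h^-1 Mpc (trapezoidal rule)."""
    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(i * dz) for i in range(1, steps))
    return HUBBLE_DIST * total * dz


def survey_volume(area_arcmin2, z_lo, z_hi):
    """Comoving volume (h^-3 Mpc^3) behind a given area between two redshifts."""
    solid_angle = area_arcmin2 * (math.pi / (180.0 * 60.0)) ** 2   # steradians
    d_lo, d_hi = comoving_distance(z_lo), comoving_distance(z_hi)
    return solid_angle / 3.0 * (d_hi ** 3 - d_lo ** 3)


# Assumed field geometry (see lead-in): sides in arcmin.
area = 9 * 9.0 * 9.0 + 6.5 * 6.5 + 3 * 7.0 * 14.0
volume = survey_volume(area, 2.5, 3.5)
n_upper = 802 / volume          # upper bound: counts all 802 galaxies

print(f"area = {area:.1f} arcmin^2, volume = {volume:.3e} h^-3 Mpc^3")
print(f"n_g <= {n_upper:.2e} h^3 Mpc^-3")
```

Under these assumptions the bound comes out slightly above the quoted $6.6 \times 10^{-4}\,h^3\,{\rm Mpc}^{-3}$, as expected when a few galaxies fall outside the redshift window.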



2.2 Bias 

We define the bias as the square root of the ratio between the galaxy
correlation function and the dark matter correlation function:
$b_g = [\xi_g/\xi_{DM}]^{1/2}$. It should be noted that several different
definitions of bias are used in the literature, and are not equivalent (see,
e.g., Dekel & Lahav 1998; Somerville et al. 2001). Therefore, caution should be
used when comparing bias values given by different authors. An additional
complication is that the bias may be a function of the spatial scale on which it
is measured. If we adopt a cosmology and power spectrum, a definition of bias,
and a spatial scale, and if the correlation function of galaxies is well
represented as a power law, $\xi = (r/r_0)^{-\gamma}$, then for any galaxy population
where $r_0$ and $\gamma$ are determined, we can translate this to a bias value.

We wish to define the 'large-scale' bias on a scale that is larger than the
size of individual haloes, so that it is mainly affected by the clustering
properties of distinct haloes themselves, and is not significantly affected by
halo exclusion effects or the occupation function within haloes. This scale is
approximately $\gtrsim 1$-$2\,h^{-1}$ Mpc for the relevant halo masses at $z = 3$.
However, we encounter a problem in defining a sensible value of the large-scale
bias for the case of LBGs at $z = 3$. The correlation function of the dark
matter at $z = 3$ in our chosen cosmological model$^1$ cannot be well fit by a
single power law; rather, it resembles a broken power law with the break
occurring at about $1\,h^{-1}$ Mpc. The slope at scales smaller than
$1\,h^{-1}$ Mpc is roughly $\gamma_{DM} \simeq 1.5$, and at larger scales it changes
to a shallower slope of about $\gamma_{DM} \simeq 1.2$. If the correlation function
of LBGs is really a power law with a slope of 1.5-1.6, as indicated by
observations (Adelberger 2000; Porciani & Giavalisco 2001), then this implies
that the bias on scales $> 1\,h^{-1}$ Mpc is strongly scale-dependent, and does
not asymptote to a stable value on any reasonable scale. The implied bias values
range from $b \simeq 1.8$ at a scale of $8\,h^{-1}$ Mpc, which is the largest radius
where the correlation function of LBGs is observationally determined, to
$b \simeq 2.6$ at $1\,h^{-1}$ Mpc. At no point is the bias really constant over any
significant range of scales. However, we do not wish to deal explicitly with the
scale dependence of the bias and the detailed shape of the correlation function
here, as the current observational constraints do not warrant such a detailed
investigation. We thus assume that both the haloes and dark matter have
$\gamma = 1.5$ on all scales, and can thus define the bias as
$b = (r_{0,DM}/r_{0,g})^{-\gamma/2}$, with $r_{0,DM} \simeq 1.2\,h^{-1}$ Mpc. This
approximation works best at scales of about $2\,h^{-1}$ Mpc, but we find that the
approximation for halo bias (Sheth et al. 2001) that we discuss in the following
section, in combination with this assumption about the dark matter correlation
function, agrees with the halo correlation function measured in simulations
within about 10 per cent over all relevant scales. This is basically equivalent
to defining the bias at a scale of ~ $2\,h^{-1}$ Mpc.

The observational estimates of $r_0$ and $\gamma$ for LBGs vary somewhat depending
on the sample and the technique used to obtain the three-dimensional, real-space
correlation function from the projected or redshift-space data. A summary of the
observationally derived correlation function parameters and implied bias values
is given in Table 1. We do not have a measured bias value for the full A01
sample, but this sample is quite close to that of the Adelberger (2000) sample
and we use those values to constrain our model.
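
The conversion from a measured correlation length to a bias value under the fixed-slope assumption can be sketched in a few lines of Python. The function name `bias_from_r0` is an illustrative choice; the correlation lengths are the legible $r_0$ entries of Table 1, and the slope and dark matter correlation length are the values assumed in the text.

```python
GAMMA = 1.5     # slope assumed for both galaxies and dark matter
R0_DM = 1.2     # dark matter correlation length at z = 3, in h^-1 Mpc


def bias_from_r0(r0_gal, gamma=GAMMA, r0_dm=R0_DM):
    """b = (r0_dm / r0_gal)^(-gamma / 2), i.e. sqrt(xi_g / xi_DM) for equal slopes."""
    return (r0_dm / r0_gal) ** (-gamma / 2.0)


# Correlation lengths (comoving h^-1 Mpc) as listed in Table 1.
samples = [
    ("A98, SPEC CIC", 6.0),
    ("Adelberger 2000, SPEC w(theta)", 3.8),
    ("GD01, SPEC CIC", 5.0),
    ("GD01, PHOT w(theta)", 3.2),
    ("GD01, HDF w(theta)", 1.4),
    ("Arnouts et al. 1999, HDF-N photo-z", 2.78),
]

for label, r0 in samples:
    print(f"{label:36s} r0 = {r0:4.2f} -> b = {bias_from_r0(r0):.1f}")
```

Rounded to one decimal place, these reproduce the bias column of Table 1 for the rows whose $r_0$ values are legible.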



2.3 Close Pairs 

The close pair count $n_{cp}$ describes the number of pairs of galaxies within a
fixed angular separation on the sky and within a fixed separation in redshift
$\pm\Delta z$. The value of $n_{cp}$ provides a useful probe of small-scale
clustering and is especially sensitive to halo occupation statistics. If the
angular separation is chosen to be slightly larger than the typical angular size
of a halo (~ $0.4\,h^{-1}$ Mpc comoving, or ~ 20 arcsec for our cosmology), then
the close pair fraction can probe the number of objects within haloes without
being sensitive to the details of how galaxies are distributed spatially within
them. Here, we focus on this single angular scale, which we find to be most
useful in constraining our chosen model parameters, although in principle a
range of angular separations could be investigated (as we did in W01). In order
to best separate the effects of projection, it is useful to limit the pair
counts to those galaxies that are within a small redshift range of each other;
however, the resolution of the data is not sufficient to completely remove
projection effects. Looking on different scales may also help to distinguish
which pairs are in the same halo. We thus calculate the close pair counts for
several choices of the redshift bin size for the A01 sample. Defining the close
pair fraction as just the number of close pairs divided by the total number of
galaxies, $f_{cp} = n_{cp}/n_g$, we find $f_{cp}(20'') = 0.010 \pm 0.004$,
$0.015 \pm 0.004$, and $0.022 \pm 0.005$ for redshift bins of size
$\Delta z = 0.005$, 0.010, and 0.040 respectively. The errors reflect
1-$\sigma$ statistical uncertainties. Note that there does not seem to be a strong
bias against selecting close pairs; the small-scale spectroscopic pair counts
with no redshift selection are almost identical to those of the photometric
sample (after taking number density into account; see also W01).

$^1$ We have calculated the correlation function for the dark matter from the
publicly available GIF simulation, described in some detail in W01.
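
The correspondence between 20 arcsec and roughly $0.4\,h^{-1}$ Mpc comoving can be checked with a small Python sketch. It assumes the flat cosmology adopted in the text, evaluates the separation at $z = 3$, and uses the small-angle approximation; the function names are illustrative.

```python
import math

OMEGA_M, OMEGA_L = 0.3, 0.7     # flat cosmology adopted in the text
HUBBLE_DIST = 2997.92458        # c / (100 km/s/Mpc), in h^-1 Mpc


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in h^-1 Mpc (trapezoidal rule)."""
    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(i * dz) for i in range(1, steps))
    return HUBBLE_DIST * total * dz


def comoving_separation(theta_arcsec, z):
    """Transverse comoving separation (h^-1 Mpc) for a small angle in a flat universe."""
    theta_rad = math.radians(theta_arcsec / 3600.0)
    return comoving_distance(z) * theta_rad


sep = comoving_separation(20.0, 3.0)
print(f"20 arcsec at z = 3 corresponds to {sep:.2f} h^-1 Mpc comoving")
```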



3 A GENERAL MODEL FOR GALAXY 
CLUSTERING 

In this section we present the analytic expressions used to predict the three
observables introduced above in Section 2: the number density of observed
galaxies, $n_g$, the large-scale galaxy bias, $b_g$, and the close pair
fraction, $f_{cp}$. In the expressions for derived quantities that follow, we
suppress the $S$, $M_1$, and $M_{\rm min}$ dependence; such a dependence should be
assumed unless otherwise noted.

The comoving number density of galaxies is the integral over $dn/dM$, the
differential number density of dark matter haloes as a function of halo mass
$M$, weighted by the appropriate galaxy occupation function:

$$n_g = \int_{M_{\rm min}}^{\infty} \frac{dn}{dM}(M)\, N_g(M)\, dM. \qquad (5)$$



For the halo mass function, we use the analytic expression developed by Sheth &
Tormen (1999), which agrees fairly well with the results of N-body simulations
(see e.g. Jenkins et al. 2001; Sigad et al. 2000; Wechsler 2001):

$$\frac{dn}{dM} = -\frac{\bar\rho}{M}\frac{d\sigma}{dM}\,\ldots\left[1 + (a\nu^2)^{-q}\right]\exp\left[\ldots\right] \qquad (6)$$

Here, $\sigma$ is the linear rms variance of the power spectrum on the mass scale
$M$ at redshift $z = 3$ and $\nu = \delta_c/\sigma$, where $\delta_c = 1.686$ is the
critical overdensity value for collapse. The other parameters are $a = 0.707$,
$q = 0.3$, and $w = 0.163$, which were chosen to match N-body simulations with
the same cosmology and power spectrum as the one we have assumed.

Table 1. Observational correlation function parameters, for several different
samples and methods, assuming the same $\Lambda$CDM cosmology used throughout our
analysis. SPEC indicates a sample with spectroscopic redshifts, PHOT refers to
the ground-based samples of photometric LBG candidates, and HDF to the deeper
sample of $U_{300}$ drop-outs from the HDF North and South. HDF-N photo-z is the
sample of HDF North galaxies with photometric redshifts in the range
$2.5 < z < 3.5$. All magnitude limits are given in the AB system, and are the
authors' stated completeness limits (note that the SPEC samples of A98 and GD01
are just subsamples of the Adelberger 2000 sample). CIC refers to the
counts-in-cells method and $w(\theta)$ to the inversion of the angular correlation
function; ang CIC refers to a counts-in-cells method of measuring $w(\theta)$.
Where $\gamma$ is given in square brackets, this indicates that the value was
assumed rather than derived. The bias values are calculated under the assumption
that $\gamma = 1.5$ for both galaxies and dark matter, and that
$r_{0,DM} = 1.2\,h^{-1}$ Mpc. While many of the samples have considerable overlap,
they do not necessarily consist of the same galaxies.

Sample          Method       magnitude limit       r_0 [comoving h^-1 Mpc]   gamma         reference                    bias

SPEC            CIC          $\mathcal{R}$ = 25.5  6 ± 1                     [1.8]         A98                          3.3
SPEC            $w(\theta)$  $\mathcal{R}$ = 25.5  3.8 ± 0.3                 1.61 ± 0.15   Adelberger 2000              2.4
SPEC            CIC          $\mathcal{R}$ = 25.0  5.0 ± 0.7                 2.0 ± 0.2     GD01                         2.9
PHOT            $w(\theta)$  $\mathcal{R}$ = 25.5  3.2 ± 0.7                 2.0 ± 0.2     GD01                         2.1
PHOT            ang CIC      $\mathcal{R}$ = 25.5  ^■^-i°5                   l-50to.f      Porciani & Giavalisco 2001   2.5
HDF             $w(\theta)$  $V_{606}$ = 27        1.4 ± 1.0                 2.2lJ5'^      GD01                         1.1
HDF-N photo-z   $w(\theta)$  $I_{814}$ = 28.5      2.78 ± 0.68               [1.8]         Arnouts et al. 1999          1.9

We determine the large-scale bias for galaxies by integrating the expected bias
of haloes as a function of mass, $b_h(M)$, weighted by the galaxy occupation
function $N_g$:

$$b_g = \frac{1}{n_g}\int_{M_{\rm min}}^{\infty} \frac{dn}{dM}(M)\, b_h(M)\, N_g(M)\, dM. \qquad (7)$$

For the halo bias $b_h$, we use the expression of Sheth et al. (2001) based on
ellipsoidal collapse:

$$b_h(M) = 1 + \frac{1}{\sqrt{a}\,\delta_c}\left[\sqrt{a}\,(a\nu^2) + \sqrt{a}\,b\,(a\nu^2)^{1-c} - \frac{(a\nu^2)^c}{(a\nu^2)^c + b(1-c)(1-c/2)}\right], \qquad (8)$$

where $b = 0.5$ and $c = 0.6$. Note that because the bias is unaffected by
random sampling or the overall normalization, $b_g$ is independent of $M_1$ (and
also of any uniform selection probability $p$).
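
The statement that $b_g$ does not depend on $M_1$ can be illustrated numerically. The Python sketch below implements the halo bias of Eqn. 8 as a function of $\nu$, then evaluates Eqns. 5 and 7 on a logarithmic mass grid. The halo mass function `toy_dn_dm` and the mapping `toy_nu` are hypothetical stand-ins chosen only for illustration; they are not the Sheth & Tormen mass function or a real $\sigma(M)$, so the printed numbers carry no physical meaning.

```python
import math

DELTA_C = 1.686
A, B, C = 0.707, 0.5, 0.6       # parameters quoted with Eqns. 6 and 8
M_STAR = 1.0e12                 # arbitrary toy mass scale (h^-1 M_sun)


def halo_bias(nu):
    """Halo bias of Eqn. 8 as a function of nu = delta_c / sigma."""
    x = A * nu ** 2
    bracket = (math.sqrt(A) * x
               + math.sqrt(A) * B * x ** (1.0 - C)
               - x ** C / (x ** C + B * (1.0 - C) * (1.0 - C / 2.0)))
    return 1.0 + bracket / (math.sqrt(A) * DELTA_C)


def toy_nu(mass):
    """Hypothetical nu(M); a real calculation would use sigma(M) at z = 3."""
    return 1.5 * (mass / M_STAR) ** 0.15


def toy_dn_dm(mass):
    """Hypothetical halo mass function, NOT the Sheth-Tormen form."""
    return mass ** -2.0 * math.exp(-mass / (10.0 * M_STAR))


def integrate_log(func, m_lo, m_hi, steps=4000):
    """Trapezoidal integral of func(M) dM on a logarithmic mass grid."""
    dlnm = math.log(m_hi / m_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        m = m_lo * math.exp(i * dlnm)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(m) * m * dlnm
    return total


def density_and_bias(m1, m_min, slope, m_max=1.0e15):
    """Return (n_g, b_g) from Eqns. 5 and 7 for the toy inputs above."""
    def occ(m):
        return (m / m1) ** slope

    n_g = integrate_log(lambda m: toy_dn_dm(m) * occ(m), m_min, m_max)
    weighted = integrate_log(
        lambda m: toy_dn_dm(m) * halo_bias(toy_nu(m)) * occ(m), m_min, m_max)
    return n_g, weighted / n_g


for m1 in (1.0e12, 1.0e13):
    n_g, b_g = density_and_bias(m1, 1.0e10, 1.0)
    print(f"M_1 = {m1:.0e}: n_g = {n_g:.3e} (toy units), b_g = {b_g:.4f}")
```

Changing $M_1$ rescales $n_g$ by $M_1^{-S}$ but leaves the ratio in Eqn. 7 unchanged, because the factor $M_1^{-S}$ cancels between numerator and denominator.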

The number density of galaxies in close pairs, $n_{cp}$, can be written as the
sum of two pieces:

$$n_{cp} = n_{cp}^{s} + n_{cp}^{d}. \qquad (9)$$

The first piece, $n_{cp}^{s}$, is the number density of close pairs of galaxies
within the same halo, and the 'distinct halo' piece, $n_{cp}^{d}$, represents
galaxy pairs coming from objects that do not lie within the same halo, and are
counted as close pairs because of projection effects.

In order to calculate $n_{cp}^{d}$, we need the correlation function of galaxies
inhabiting distinct host haloes, $\xi_d(r)$. On scales larger than the typical
halo size, $\xi_d$ will mirror the halo correlation function:
$\xi_g(r) = \xi_d(r) = b_g^2\,\xi_{DM}$ (with $b_g$ the occupation-weighted halo bias
calculated from Eqn. 7). We expect this assumption to break down on small
scales, near the scale where the virial radii of the haloes begin to overlap,
$d_h = 2R_v$, where $d_h$ is the diameter (twice the virial radius) of the
average-mass halo under consideration (which implicitly depends on
$M_{\rm min}$). The fact that haloes are mutually exclusive in space demands that
the correlation function go to zero (and to $-1$) at some scale below $d_h$. A
simple assumption is $\xi_d(r) = 0$ for $r < d_h$, and remarkably, when we make
this assumption, we reproduce the projected close pair counts derived from the
N-body simulations discussed in W01 to an accuracy of 5-20 per cent. Although
the true nature of $\xi_d$ is certainly more complicated, a calculation of this
accuracy is sufficient for the level of observational precision relevant to this
work, so we adopt this simple break-radius form for $\xi_d$ for the rest of our
analysis.

Given $\xi_d$ and the volume $V$ defined by the bin geometry, the number of
expected pairs is now straightforward to determine. The average number of pairs
within $V$ is:

$$\langle N_{\rm pairs}\rangle_V = 0.5\,\langle N(N-1)\rangle_V. \qquad (10)$$

Making use of

$$\sigma^2(N) = \langle N^2\rangle - \langle N\rangle^2 = n_g^2 \int_V \xi_d(r_{12})\, dV_1\, dV_2 + \langle N\rangle, \qquad (11)$$

we obtain $2\langle N_{\rm pairs}\rangle = \sigma^2(N) - \langle N\rangle + \langle N\rangle^2$.
With $\langle N\rangle = n_g V$, the number density of pairs for galaxies in distinct
host haloes is therefore

$$n_{cp}^{d} = \frac{0.5\,n_g^2}{V}\int_V \xi_d(r_{12})\, dV_1\, dV_2 + 0.5\,n_g^2\,V. \qquad (12)$$

Here, $r_{12}$ is the distance between the volume elements $dV_1$ and $dV_2$.

The same-halo piece of the close-pair expression, $n_{cp}^{s}$, is obtained by
integrating the expected pair counts, $\langle N_g(N_g - 1)\rangle$, in haloes of a
given mass over the halo mass function. Later, we will explore how scatter in
the halo occupation function affects the close pair counts, but for now, we make
the limiting assumption of zero variance. This implies
$\langle N_g(N_g - 1)\rangle = N_g(M)[N_g(M) - 1]$, and gives

$$n_{cp}^{s} = 0.5\int_{M_*}^{\infty} \frac{dn}{dM}\, N_g(M)\,[N_g(M) - 1]\, dM. \qquad (13)$$

The lower limit of the integral is $M_* = \max(M_{\rm min}, M_1)$.

This two-piece approximation for calculating the expected close pair counts
provides a clear and intuitive picture of what the close pairs represent
physically. The projected piece, $n_{cp}^{d}$, depends mainly on $b_g$ and
$n_g$, so that at a fixed bias, it is nearly independent of how the halo
occupation function varies with mass. The $n_{cp}^{s}$ piece, however, depends
strongly on the occupation function, and in particular on the slope $S$. Note
that the close pair calculation neglects redshift space distortions.
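
The same-halo pair integral, with its lower limit at the larger of $M_{\rm min}$ and $M_1$, can be sketched in Python as follows. The mass function `toy_dn_dm` is a hypothetical pure power law used only because it admits a closed-form check for $S = 1$; the factor of 0.5 follows the pair-counting convention of Eqn. 10, and the upper mass cut `m_max` is an assumption needed to keep the toy integral finite.

```python
import math


def toy_dn_dm(mass, norm=1.0e9):
    """Hypothetical halo mass function dn/dM = norm * M^-2 (illustration only)."""
    return norm * mass ** -2.0


def same_halo_pair_density(m1, m_min, slope, m_max=1.0e15, steps=20000):
    """0.5 * integral of dn/dM * N_g * (N_g - 1) dM from M_* = max(m_min, m1).

    Assumes zero variance in the occupation number, as in the text, and
    integrates with the trapezoidal rule on a logarithmic mass grid.
    """
    m_star = max(m_min, m1)
    dlnm = math.log(m_max / m_star) / steps
    total = 0.0
    for i in range(steps + 1):
        m = m_star * math.exp(i * dlnm)
        n_gal = (m / m1) ** slope
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * toy_dn_dm(m) * n_gal * (n_gal - 1.0) * m * dlnm
    return 0.5 * total


def analytic_slope_one(m1, m_min, m_max=1.0e15, norm=1.0e9):
    """Closed form of the same integral for slope = 1 and the toy mass function."""
    m_star = max(m_min, m1)
    return 0.5 * norm * ((m_max - m_star) / m1 ** 2 - math.log(m_max / m_star) / m1)


numeric = same_halo_pair_density(1.0e13, 1.0e10, 1.0)
exact = analytic_slope_one(1.0e13, 1.0e10)
print(f"numeric = {numeric:.6e}, analytic = {exact:.6e} (toy units)")
```

Haloes with $M < M_1$ have $N_g < 1$ and would contribute a negative pair count under the zero-variance assumption, which is why the integral starts at $M_*$ rather than at $M_{\rm min}$.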

Figure 1. The critical mass $M_1$ for hosting more than one observed galaxy as
a function of the minimum mass for hosting any observed galaxy, for different
values of the halo occupation slope and galaxy bias. The number density is fixed
to the observed value for the A01 sample of $z \sim 3$ Lyman-break galaxies (see
text). Thin solid lines correspond to fixed values of $S$; dashed lines
correspond to constant bias values. The shaded band indicates the allowed region
for $b_g = 2.2$-$2.5$, the range of bias values favored by the Adelberger (2000)
analysis.

We stress again that as we are dealing with a sample which has an uncertain
relationship to the intrinsic underlying population, all of the above
definitions refer to the numbers of objects that would be included in our
observational sample. For example, we explicitly define the occupation function,
$N_g$, to be the number of observed LBGs per halo, rather than the actual number
of galaxies per halo. In this language, the number of galaxies that actually
exist per halo will be $p^{-1}N_g$. This uncertainty in normalization translates
to an uncertainty in the 'intrinsic' value of $M_1$ via
$M_1^{\rm intrinsic} = p^{1/S} M_1$, but does not affect the other estimates. The
special case of one-galaxy-per-halo ($S = 0.0$) is a bit different. For models
of this kind, $M_1$ is undefined, and $p$ must be defined explicitly. However,
in this case, the model will still contain the same number of parameters, with
$M_1$ replaced by $p$.
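
The rescaling between the observed and intrinsic normalization follows from $p^{-1}(M/M_1)^S = [M/(p^{1/S}M_1)]^S$. A minimal Python sketch, with illustrative input values that are not results of the analysis:

```python
def intrinsic_m1(m1_observed, p, slope):
    """Mass at which a halo hosts one galaxy of the underlying population.

    Follows from N_true = N_g / p = (M / (p**(1/slope) * M_1))**slope.
    Undefined for slope = 0, where p replaces M_1 as a model parameter.
    """
    if slope == 0:
        raise ValueError("M_1 is undefined for slope = 0")
    return p ** (1.0 / slope) * m1_observed


# Illustrative inputs only: M_1 in h^-1 M_sun, p as assumed in W01.
m1_obs, p, slope = 8.0e12, 0.14, 1.0
m1_int = intrinsic_m1(m1_obs, p, slope)
print(f"observed M_1 = {m1_obs:.2e}, intrinsic M_1 = {m1_int:.2e}")

# Consistency check at an arbitrary mass: both forms give the same N_true.
mass = 3.0e12
print((mass / m1_int) ** slope, (mass / m1_obs) ** slope / p)
```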

The model outlined above is general in the sense that it could be applied to
any population of galaxies whose halo occupation function is well approximated
by a power law, over scales where the assumption of linear (scale-free) bias is
sensible. Here, we proceed to apply it to the single example of Lyman-break
galaxies in the $\Lambda$CDM cosmology specified above.



4 CONSTRAINTS FROM LYMAN-BREAK 
GALAXIES 

In this section we use the observational data derived from the $z \sim 3$ sample
of Lyman-break galaxies summarized in Section 2 to constrain the halo occupation
function for these objects. Fig. 1 shows the relation between the model $M_1$
and $M_{\rm min}$ values obtained by inverting equations 5 and 7 for given values
of the number density $n_g$, bias $b_g$, and occupation function slope $S$. In
all cases, we fix the number density to the observed value of $n_g$ given in
Section 2. The thin solid lines show the relation for different values of the
occupation function slope $S$. As $M_{\rm min}$ decreases, galaxies may inhabit
smaller mass haloes, which are more numerous. In order to maintain the observed
number density, the mass $M_1$ at which a host halo contains one galaxy must
increase correspondingly. The change in $M_1$ as a function of $M_{\rm min}$ is
steeper for smaller values of $S$.
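
Because $M_1$ enters Eqn. 5 only through the overall factor $M_1^{-S}$, the inversion for $M_1$ at fixed $n_g$, $M_{\rm min}$ and $S > 0$ is a single closed-form step once the mass-weighted integral is known. The Python sketch below shows this with a hypothetical mass function `toy_dn_dm`, which stands in for Eqn. 6 and is not the form used in the analysis, so the resulting masses are illustrative only.

```python
import math


def toy_dn_dm(mass, m_star=1.0e12, norm=1.0e8):
    """Hypothetical halo mass function (stand-in for Eqn. 6, illustration only)."""
    return norm * mass ** -2.0 * math.exp(-mass / m_star)


def mass_moment(m_min, slope, m_max=1.0e15, steps=4000):
    """Integral of dn/dM * M^slope dM from m_min to m_max (log-grid trapezoid)."""
    dlnm = math.log(m_max / m_min) / steps
    total = 0.0
    for i in range(steps + 1):
        m = m_min * math.exp(i * dlnm)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * toy_dn_dm(m) * m ** slope * m * dlnm
    return total


def number_density(m1, m_min, slope):
    """Eqn. 5 for the power-law occupation function: M_1^-slope times the moment."""
    return mass_moment(m_min, slope) / m1 ** slope


def solve_m1(n_g, m_min, slope):
    """Invert Eqn. 5 for M_1 at fixed n_g, m_min and slope (slope > 0)."""
    return (mass_moment(m_min, slope) / n_g) ** (1.0 / slope)


target = 6.6e-4     # observed n_g from Section 2.1, in h^3 Mpc^-3
for m_min in (1.0e9, 1.0e10, 1.0e11):
    m1 = solve_m1(target, m_min, 1.0)
    print(f"M_min = {m_min:.0e} -> M_1 = {m1:.3e} (toy mass function)")
```

Lowering $M_{\rm min}$ adds haloes to the integral, so $M_1$ must rise to keep $n_g$ fixed, which is the trend traced by the solid lines in Fig. 1.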

If instead of fixing $S$, we fix the bias $b_g$ (as well as the number
density), we obtain the dashed lines in Fig. 1. By comparing the dashed and
solid lines, it is evident that for fixed values of $M_{\rm min}$, the bias is an
increasing function of the slope $S$. This is because, for fixed $M_{\rm min}$ and
for larger values of $S$, more galaxies reside in larger mass hosts. It is also
evident that the galaxy bias is a stronger function of $M_{\rm min}$ for high-$S$
models.

The thin vertical line corresponds to an S = 0.0 model with p = 1.0. This corresponds to the simple 'one-galaxy-per-massive-halo' type model that has often been considered in previous works. As discussed above, for the special case S = 0, M1 is irrelevant and we must assign a value for the selection probability p in order to obtain a value for Mmin that provides the desired (observed) density ng. Since p cannot exceed 1.0, this vertical line represents an upper limit on the value of Mmin for a given value of ng.¹ Note that for this reason, it is possible to find combinations of ng and bg that cannot be reproduced simultaneously in this sort of model (i.e. if ng is 'too high' for the given bias value). Similarly, as p is lowered in an S = 0.0 model, subject to the constraint that the observed value of ng is fixed, the implied 'true' number density increases, and Mmin must be reduced accordingly. Since low-mass haloes are less strongly clustered, the corresponding predicted bias will also decrease. The (dashed) lines of constant bias in Fig. 1 approach vertical asymptotes at the value of Mmin that produces this bias in the S = 0.0 case. For example, if S = 0.0 and p = 0.1, then the observed number density requires Mmin ≃ 1.2 × 10^11 h^-1 M⊙ and gives bg ≃ 2.2. Note that the bg = 2.2 dashed line approaches its vertical asymptote at the same value of Mmin.

Even if only the number density and large-scale bias are known, Fig. 1 already places strong limits on the halo occupation function for z ~ 3 galaxies. For example, if the observed LBG bias is constrained at bg < 3, then the critical mass above which haloes host more than one observed galaxy at z ~ 3 must be rather large, M1 > 4 × 10^12 h^-1 M⊙, regardless of the values of S and Mmin. If bg is estimated to lie within some well-defined range, then, independent of the occupation function slope S, the model parameters M1 and Mmin must lie in a region defined by the two (dashed) bias lines corresponding to that range. For example, the bias estimate from Adelberger (2000) is bg ≃ 2.2 − 2.5. The allowed region implied by this measurement is indicated by the shaded band in Fig. 1. Specifically, for M1 > 5 × 10^13 h^-1 M⊙ this implies Mmin ≃ (1 − 2) × 10^11 h^-1 M⊙, and for Mmin < 10^11 h^-1 M⊙ (and S > 0), it implies M1 ≃ (5 − 50) × 10^12 h^-1 M⊙.

In Fig. 2, we have again fixed the number density to

¹ Although it should be noted that for the LBG sample, the
known selection effects due to sub-sampling and the color-
selection technique mean that the selection probability is almost
certainly less than about 25 percent (p ≲ 0.25), as discussed earlier.
Taking this into account would lead to a correspondingly lower
value for Mmin.








Figure 2. The fraction of pairs within an angular separation of 20" for three different z binnings: Δz = 0.005, 0.010, and 0.040, shown
as a function of the galaxy bias. The five lines in each panel correspond to halo occupation slopes of S = 1.1, 1.0, 0.8, 0.6 and 0.4. The
number density of galaxies is fixed to the observed value for spectroscopically-confirmed LBGs. The sharp upturn in the S = 0.4 line
occurs at a bias of ~ 3.2, which is the maximum bias for an S = 0.0 model (p = 1.0). In order to obtain a bias greater than this for the
S = 0.4 case, Mmin must be larger than M1, which forces all haloes to host more than one galaxy on average, and thus increases the
close pair fraction drastically. The data points and solid-line error bars show determinations of fcp and bg from the A01 and Adelberger
(2000) samples respectively. The dotted-line error bars reflect the range in bg determinations from a number of recent estimates (see
text), providing a reasonable estimate of the systematic uncertainty in determining bg.



match the observed value, but now we plot the model predictions in the plane of close pair fraction versus the large-scale bias (note that both quantities are directly observable). Recall that in estimating fcp, we use angular bins of radius 20" (0.4 comoving h^-1 Mpc at z = 3 in our adopted cosmology). The pairs are defined in redshift bins of Δz = 0.005, 0.010, and 0.040, as indicated separately in each panel. The solid lines show model predictions for fixed values of the slope, S = 1.1, 1.0, 0.8, 0.6, and 0.4. The high-S lines lie above those of lower S. The lines are truncated at a bias corresponding to Mmin = 10^8 h^-1 M⊙. This is an extremely conservative lower limit on the expected mass of haloes that can host LBGs: two orders of magnitude smaller than the lower limit on the LBG host mass derived by Pettini et al. (2001) using equivalent widths of nebular emission lines.

The curves tend to be flatter at low bias and to rise more steeply at large bias. This corresponds to a transition between a close pair density that is dominated by projection at low bias, and one that is dominated by objects within the same halo at high bias. As the close pair fraction is calculated for larger and larger bins in redshift Δz, the contribution due to galaxies in projection becomes larger, while that coming from objects within the same halo remains the same. This is why, for low bg, the values of fcp generally change from one panel to the next, while for high bg there is very little change. This tendency also provides an additional constraint. In principle the model can be constrained with one choice of Δz, but by looking at data with various choices of the redshift bin, we can get an additional handle on the fraction of the close pairs coming from same-halo galaxies and from projection effects. For example, a strong change in close pair fraction with Δz binning would indicate that the close pairs are dominated by projection effects.

The behavior at high bias, corresponding to the single-halo dominated regime, can be understood by examining Fig. 1. For a fixed slope, the typical number of objects within the same halo depends primarily on the value of M1 relative to Mmin. Since M1 is the mass above which haloes typically host more than one galaxy, if M1 is much larger than Mmin, then most haloes will host fewer than one object on average, and the close pair fraction will be small. Similarly, if M1 ~ Mmin then a large fraction of haloes will host multiple galaxies,
-bg
relation are steeper for low-S models because the M1−Mmin relation is also steeper for these models. The transition from projection-dominated to same-halo-dominated occurs at larger bias for lower S models for the same reason. In contrast to the single-halo piece, the projection-dominated (low-bias) regime tends to be steeper for high-S models. The reason again is associated with the slopes of the M1−Mmin curves shown in Fig. 1. The projected close pair density is proportional to the amplitude of the distinct halo correlation function. Therefore one might expect that it simply would grow as a function of bias, and indeed it does for high-S models. The reason why the low-S models show very little change as a function of bias is that this tendency is compensated by the rapid change in Mmin as a function of bias in these models. As Mmin increases, so does the typical halo size, and therefore the region over which halo exclusion drives the separate halo correlation function to zero becomes larger. For S = 0.4, the typical halo exclusion size grows so rapidly with the bias that it cancels out the effect of increasing the correlation function amplitude at large scales. (The S = 0 case looks nearly identical to the S = 0.4 case out to a bias of about 3.2, which is why we do not plot any smaller values of S.)

The data points on Fig. 2 show the estimates of close pair fraction and 1σ uncertainties derived from the A01 data set, as discussed earlier, along with the bias value and its statistical 1σ uncertainty from Adelberger (2000). We use the bias and error from Adelberger (2000) because this sample is very similar to that of A01. However, the formal uncertainty quoted for this value likely overstates the precision to which we know the value of bg, since best estimates tend to vary for different samples and methods (see Table 1). In order to allow for this, we show a dotted error bar, which spans the range of recent determinations (bg = 2.1 − 2.9). From now on, we refer to the Adelberger (2000) uncertainty (solid line error bar) as our fiducial bias range and the larger uncertainty (dotted line error bar) as our expanded range.

The data seem to favor a model with S ~ 1.0. The fact that this slope seems to reproduce the observed fcp counts (within fiducial errors) for all three Δz binnings is encouraging, and is also an indication that our simplified halo occupation model provides a reasonable description of the galaxy-to-halo relation for this sample. The Δz = 0.01 panel provides the strongest constraint, and within the fiducial 1σ uncertainties shown, the compatible range of model slopes is S ≃ 0.9 − 1.1, with larger values of bias preferring shallower slopes. The expanded bias range is consistent with a slightly larger slope range S ≃ 0.8 − 1.1.

Fig. 3 maps the constraints shown in Fig. 2 to the allowed regions in M1 and Mmin parameter space. Allowed regions are filled with closely spaced slanted lines. For each observationally-consistent value of bg, there is a range of S values that are allowed by the close pair constraint. Correspondingly, there are well-defined regions in M1 and Mmin space consistent with each S and bg combination, as indicated by the slanted line-filled bands. It is important to realize that each point in a filled region represents a unique model combination of S, M1, and Mmin. The solid lines, representing models of constant S in this space, are shown to help guide this understanding.

The left panel shows the region consistent with our fiducial bias range and the right panel corresponds to the expanded bias range. No strong statistical significance should be attached to the filled bands; they simply represent areas in parameter space that provide overlap between the theoretical predictions and the data shown in Fig. 2. We have neglected any error in the analytic estimates, which is likely of the order of ~ 10 per cent in fcp and bg. The main point to take away is that the allowed parameter space has been significantly reduced compared to that shown in Fig. 1. For our fiducial bias uncertainty we find that Mmin should lie roughly in the range (0.4 − 8) × 10^10 h^-1 M⊙ and M1 ≃ (6 − 10) × 10^12 h^-1 M⊙. For the expanded uncertainty range, the allowed space for Mmin expands as well, Mmin ≃ (0.1 − 20) × 10^10 h^-1 M⊙, but the M1 range remains roughly the same. Note that larger allowed values for Mmin correspond to lower values of S (at fixed fcp).

It is useful at this point to explore how an intrinsic 
scatter in halo occupation would affect these estimates of 
the allowed model parameter space. Until this point, we have 
assumed a deterministic relation between a halo's mass and 
the number of galaxies it hosts, but any realistic model of 
galaxy formation will surely predict at least some scatter in 
this quantity. In principle, this distribution about the mass-occupation
relation could be treated as an additional input
of the model; however, here we will work out a simple case
motivated by semi-analytic galaxy formation scenarios in
order to illustrate how scatter affects our results.

Scatter in the halo occupation function will have no effect on the predictions of number density and large-scale bias, and will alter only the close pair fraction associated with same-halo pairs. In order to account for this, we must alter the same-halo pair expression by replacing Ng(Ng − 1) with the appropriate expression for ⟨Ng(Ng − 1)⟩. Although one might suppose that a Poisson distribution would be a reasonable assumption, ⟨Ng(Ng − 1)⟩ = Ng^2(M), such an assumption is physically unrealistic for Ng < 1, or equivalently M < M1. As the host mass falls below that typical for containing an object, the likelihood for it to host any additional objects becomes suppressed simply by mass counting arguments. This kind of sub-Poisson scatter is seen for low-mass hosts in semi-analytic models of galaxy formation (e.g. Scoccimarro et al. 2001). For our illustrative example, we will use the same halo pair counting observed in the semi-analytic models presented in W01 and Somerville et al. (2001a), which becomes sub-Poisson below Ng = 1:

\langle N_g (N_g - 1) \rangle =
\begin{cases}
N_g^2, & N_g \ge 1 \\
N_g^2 \, \ln(4 N_g) / \ln(4), & 0.25 < N_g < 1
\end{cases}
\qquad (14)

We have suppressed the implicit mass dependence in Ng = Ng(M). Although the Somerville et al. (2001a) models are best described by an occupation function with S ≃ 0.7 − 0.8, we assume here that the above formula holds for all values of S.
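
To make the effect of this prescription concrete, the Python sketch below compares the second factorial moment of Eq. (14) with the Poisson and zero-scatter cases. The function names are hypothetical. Two choices are our own assumptions rather than statements from the text: the moment is set to zero for Ng ≤ 0.25, where Eq. (14) gives no branch, and the zero-scatter value Ng(Ng − 1) is floored at zero.

```python
import math


def pair_moment_deterministic(n_mean):
    """N_g (N_g - 1) with no scatter; the floor at zero is our assumption."""
    return max(n_mean * (n_mean - 1.0), 0.0)


def pair_moment_poisson(n_mean):
    """<N_g (N_g - 1)> = N_g^2 for a Poisson distribution."""
    return n_mean ** 2


def pair_moment_scatter(n_mean):
    """Second factorial moment following the two branches of Eq. (14).

    Returning 0 for N_g <= 0.25 is an assumption: the text only
    specifies the branches with N_g > 0.25.
    """
    if n_mean >= 1.0:
        return n_mean ** 2
    if n_mean > 0.25:
        return n_mean ** 2 * math.log(4.0 * n_mean) / math.log(4.0)
    return 0.0


for n_mean in (0.2, 0.3, 0.5, 0.9, 1.0, 2.0):
    print(n_mean, pair_moment_deterministic(n_mean),
          pair_moment_scatter(n_mean), pair_moment_poisson(n_mean))
```

For 0.25 < Ng < 1 the prescription lies between the zero-scatter and Poisson values, and it joins the Poisson form continuously at Ng = 1.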

The results of this calculation are shown in Fig. 4. When
the average number of objects is small, including scatter
increases the probability of having multiple objects in the
same halo, and consequently increases the expected close
pair count. For this reason, the steeply rising portions of the
fcp−bg curves begin to become important at lower values
of bg relative to those in Fig. 2. It is encouraging that even
with the inclusion of this substantial amount of scatter, the
Figure 3. Allowed region of halo occupation function parameter space implied by all three observational constraints (ng, fcp and bg),
with and without scatter in the halo occupation statistics. Solid lines show models of constant slope S. (Left) Filled regions indicate
the parameter space consistent with 1σ bias errors from Adelberger (2000). (Right) Filled regions show how the allowed model space
expands when a larger range for the bias is allowed (see text).



data appear to favor a slope similar to that suggested by the zero-scatter models shown in Fig. 2, although slightly lower, S ≃ 0.9. The fiducial 1σ errors overlap with the model curves for 0.7 < S < 1.0, and the expanded bias uncertainty extends the allowed range to somewhat shallower slopes, 0.4 < S < 1.0. The corresponding allowed regions in M1−Mmin space for the fiducial (left panel) and expanded (right panel) bias ranges are indicated by the vertically-shaded bands in Fig. 3. The ranges of preferred Mmin values have significant overlap with those in the zero-scatter case: Mmin ≃ (1 − 13) × 10^10 h^-1 M⊙ for the fiducial bias uncertainty, and Mmin ≃ (0.6 − 40) × 10^10 h^-1 M⊙ for the expanded case. However, the preferred regions are offset towards higher M1 by roughly a factor of two; this is because increasing the scatter increases the close pair fraction, and thus the number of objects per halo must be decreased by increasing M1. Note, however, that the constraint on the minimum halo mass Mmin is more robust to the inclusion of scatter.



5 LUMINOSITY/NUMBER DENSITY 
DEPENDENCE 

So far, we have considered observational constraints for a population of galaxies with a given magnitude limit. However, it is interesting to consider how the predicted properties would change for samples with different magnitude limits. Because the observed number density is a function of the magnitude limit of the sample (for a sample of known completeness), this can also be thought of as considering different values for the observed number density.

Such a prediction is particularly relevant in light of the work by Giavalisco & Dickinson (2001, GD01), which suggests a correlation between the LBG clustering amplitude and intrinsic galaxy luminosity. GD01 compared correlation lengths obtained from ground-based spectroscopic and photometric samples, and a deeper sample of LBGs identified in the Hubble Deep Field. They found that the correlation length strongly decreased as the magnitude limit of the sample grew fainter, or, similarly, as the observed number density of the population increased. If correct, this result has interesting implications for the relationship between observed galaxies and dark haloes. Unfortunately, a prediction for the expected clustering as a function of observed number density is rather unconstrained in our model. This is because, in principle, all three of our model parameters could vary as a function of galaxy luminosity.

We can overcome this problem by making some plausible simplifying assumptions. For example, perhaps the simplest possibility is that the value of S stays fixed when the number density/luminosity cutoff of a sample changes (which is roughly true in the semi-analytic models of Somerville et al. 2001b), and that Mmin and M1 vary together in the natural way, Mmin ∝ M1. This assumption may be motivated qualitatively by assuming that Mmin is proportional to the minimum observable galaxy luminosity, and that the host haloes themselves are self-similar as smaller and smaller haloes become important. For S = 0, the M1 assumption cannot apply, and is replaced by the assumption that the selection probability p remains fixed.

The resulting model predictions are shown in Fig. 5, for four values of the halo occupation slope S. We have normalized each model so that it has bg = 2.4 at the number density of the spectroscopically confirmed sample of Adelberger (2000), and predict how the bias should vary as a function of ng. The S = 0 model shows the steepest dependence because the number density can be increased only by adding galaxies to increasingly lower mass (and less clustered) haloes.
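
The scaling assumption used for these predictions (fixed S, with Mmin ∝ M1) can be illustrated with a short Python sketch. The halo mass function `toy_dn_dm`, the bias relation `toy_halo_bias`, the fixed ratio Mmin/M1, and every constant below are hypothetical placeholders, not the ΛCDM ingredients of the paper, and the curves are not renormalized to bg = 2.4. The example only shows how a bias versus number density track is generated as Mmin is varied.

```python
import numpy as np

M_STAR, M_MAX = 1.0e12, 1.0e15  # hypothetical masses [h^-1 M_sun]


def toy_dn_dm(mass):
    """Toy halo mass function dn/dM (illustrative only)."""
    x = mass / M_STAR
    return 1.0e-3 / M_STAR * x ** -1.9 * np.exp(-x)


def toy_halo_bias(mass):
    """Toy halo bias that increases with mass (illustrative only)."""
    return 0.8 + (mass / M_STAR) ** 0.5


def density_and_bias(m_min, ratio, slope, n_grid=4000):
    """Return (n_g, b_g) for N_g = (M / M1)^S above M_min, with M1 = m_min / ratio."""
    mass = np.logspace(np.log10(m_min), np.log10(M_MAX), n_grid)
    m1 = m_min / ratio
    weight = (mass / m1) ** slope * toy_dn_dm(mass)
    dm = np.diff(mass)
    n_g = float(np.sum(0.5 * (weight[1:] + weight[:-1]) * dm))
    wb = weight * toy_halo_bias(mass)
    b_g = float(np.sum(0.5 * (wb[1:] + wb[:-1]) * dm)) / n_g
    return n_g, b_g


def bias_density_track(ratio, slope, m_min_values):
    """Trace (n_g, b_g) as M_min varies at fixed S and fixed M_min / M1."""
    return [density_and_bias(m, ratio, slope) for m in m_min_values]


track = bias_density_track(ratio=0.01, slope=0.8,
                           m_min_values=np.logspace(10, 11.5, 6))
for n_g, b_g in track:
    print(f"n_g = {n_g:.3e}  b_g = {b_g:.3f}")
```

In this toy setup, lowering Mmin raises the number density and lowers the bias, which is the qualitative behavior described above.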




Figure 4. Close pair fraction as a function of bias for different values of the halo occupation slope, as in Fig. 2, but including a non-zero
scatter about the halo occupation relation, as described in the text. The number density is fixed to the observed value for the A01
sample of Lyman-break galaxies, as usual. The fraction of pairs within an angular separation of 20 arcsec is shown for three different
z binnings: Δz = 0.005, 0.010, and 0.040, as a function of large-scale bias. The five lines in each panel correspond to halo occupation
slopes of S = 1.0, 0.8, 0.6, 0.4, and 0.2.



The data points correspond to the observational estimates. The triangles show the results of GD01. The square reflects the Adelberger (2000) estimate. The angular correlation function result from recent work by Porciani & Giavalisco (2001) is shown by the pentagon. The filled circle shows the results of Arnouts et al. (1999) based on a sample from the HDF with a similar magnitude limit to the GD01 HDF data, but selected via photometric redshifts rather than the Lyman-break technique. We have not included the Adelberger et al. (1998) determination of the bias because it has been superseded by the Adelberger (2000) estimate, which uses a larger sample of galaxies and the same counts-in-cells method. In light of the disagreement between the various estimates at fixed density, the strength of the trend must be regarded as rather uncertain. If the GD01 points are neglected, then all four models are consistent with the data, but taken together, the data seem to favor a model closer to the S = 0.0 line, unlike the close-pair and bias constraints discussed in the previous sections, which favored S ≃ 0.9 − 1.0.

However, it is important to note that the GD01 points cannot be reproduced by any of these minimum-assumption models, as even the S = 0.0 model trend is too shallow. The only way to obtain such a trend would be to assume that LBGs tend to avoid the most massive haloes, corresponding to a negative slope S. Otherwise a trend between clustering and number density as steep as that indicated by GD01 can only be accounted for by breaking one of our simplifying assumptions. For example, if the selection probability, p, varies systematically with the observed number density, then the low bias of the higher density sample might indicate that the true number density is simply much larger than that observed. We find that only if the selection probability varies inversely with the observed number density, p ∝ ng^-1 (with p ≤ 1), can we reproduce a trend this steep with S = 0. Although one might expect the selection probability to be higher for brighter objects, such a strong trend seems problematic for other reasons. Recall that the value of p can only affect the estimate of the number density, not the bias, so this would imply that there is almost no difference in the number density of LBGs at R = 25 and R ≃ 27, which is at odds with the observational determination of the luminosity function of LBGs over this range (Steidel et al. 1999).



6 CONCLUSIONS 

We have presented a simple and intuitive method for con- 
straining the relationship between observed galaxies and 
their dark matter haloes, and used it to constrain the galaxy- 
Figure 5. Bias (left axis), and correlation length (right axis) as
a function of comoving number density, for various values of the
slope. The data points are described in the text. Note that the
points at high and low number density have been offset slightly
in the horizontal direction in order to allow their error bars to
remain distinct.



halo occupation relation at z ~ 3. Using a three-parameter
model of the form Ng(M) = (M/M1)^S, M > Mmin, to describe
the number of observed galaxies as a function of the host
halo mass, we derived predictions for three observables: the
galaxy number density, ng, the large-scale bias, bg, and the
fraction of galaxies in close pairs, fcp. Given these three observed
galaxy properties, the three unknown model parameters can
be constrained.

We presented estimates of the allowed range for these three parameters based on the properties of galaxies in a sample of spectroscopically confirmed LBGs (Adelberger 2000, 2001). The results are summarized in Fig. 3. For a model with no scatter about the halo occupation relation, the favored values of the slope lie in the range 0.9 < S < 1.1, with preferred characteristic halo masses Mmin ≃ (0.4 − 8) × 10^10 h^-1 M⊙ and M1 ≃ (6 − 10) × 10^12 h^-1 M⊙. For a model where the halo occupation function scatter is estimated based on semi-analytic models, the range of preferred model parameters shifts to 0.7 < S < 1.0, Mmin ≃ (1 − 13) × 10^10 h^-1 M⊙ and M1 ≃ (8 − 15) × 10^12 h^-1 M⊙. Since the observational uncertainty in bg is likely significantly larger than the formal error derived for the sample of LBGs we consider, we have also explored a range of biases consistent with all of the recent estimates for LBG clustering. Using this expanded range of uncertainties, we obtain 0.8 < S < 1.1 and Mmin ≃ (0.01 − 20) × 10^10 h^-1 M⊙ for the case of no scatter, and 0.4 < S < 1.0, Mmin ≃ (0.6 − 40) × 10^10 h^-1 M⊙ if scatter is included. Preferred values of M1 are relatively insensitive to increasing the uncertainty in bg, and preferred values of Mmin are relatively insensitive to the inclusion of scatter in the occupation function.

It is interesting that the data favor a model in which the minimum halo mass for hosting an observable object, Mmin, is significantly smaller (by roughly two orders of magnitude) than the mass above which haloes typically host more than one observed object, M1. However, recall that the average mass halo hosting an observable object is typically lower: M1' = p^{1/S} M1, so the difference between Mmin and M1' ranges from about two orders of magnitude for high S values to no difference for S = 0. In any case, this implies that most haloes hosting LBGs do not contain more than one observable object. However, as we have seen, because the most massive haloes are also the most clustered, the clustering predictions are quite sensitive to the treatment of occupation statistics in these haloes even though they constitute a small fraction by number. In addition, it is interesting that the range of allowed values for Mmin has considerable overlap with mass estimates based on the widths of nebular emission lines, ~ 10^10 h^-1 M⊙ (Pettini et al. 2001), bearing in mind that these line widths may yield underestimates of the true virial masses. Line-width analyses such as this may provide useful additional constraints for the type of model presented here. For example, one might be inclined to eliminate models with Mmin ≲ 10^10 h^-1 M⊙ based on the Pettini et al. (2001) analysis, and thus significantly reduce the allowed model parameter space. However, these line-width estimates are based on a relatively small sample of very bright objects, so placing strict limits on Mmin, which is generally much smaller than a 'typical' LBG host, M1, may not be justified.

Although the current level of observational uncertainty 
prevents us from precisely defining a favored model, the 
identification of an allowed range of parameter space already 
provides useful constraints on more sophisticated modelling 
aimed at understanding halo occupation at a more basic 
level. For example, as mentioned in the introduction, the first, and, for some, still the favored model for LBG occupation is one in which there is one galaxy per halo, S = 0.0 (e.g. Wechsler et al. 1998). Our results disfavor such a model because it under-predicts the close pair counts (see also W01). Similarly, naive models for collision-driven LBGs (Kolatt et al. 1999) predicted a steep halo occupation function, S ≃ 1.1, which is the steepest slope consistent with our results and requires that LBGs populate a fraction of very low-mass haloes (although in W01 we explain why the more sophisticated treatment of collisional bursts presented in Somerville et al. (2001b) yields a shallower relation, S ≃ 0.7). The preferred range of slopes from our analysis including scatter is in agreement with the range of slopes from semi-analytic estimates for the occupation function for a range of star formation models (W01), including those in which the primary mode of star formation is merger-driven starbursts, and also those in which quiescent star formation dominates. However, it seems unlikely that the slope will be constrained well enough in the near future to distinguish between the different semi-analytic models compared in W01. Work by Porciani & Giavalisco (2001) using a counts-in-cells analysis of the angular correlation function of LBGs (using an overlapping but different sample of galaxies from that considered here) favors a model with a shallower slope; their analysis is consistent with S = 0. Some of the discrepancy may result from the more highly biased sample they consider. (The full A01 sample includes new fields that are less clustered than the first few fields studied.) As illustrated in Fig. 2, for a fixed fcp, a higher bias favors a shallower slope. Nonetheless, neither our constraints nor theirs are very strong, and we emphasize as they do that current samples may not yet be a fair representation of the high-redshift galaxy population. When the sample becomes larger, these complementary types of analyses should be applied in parallel to constrain the halo occupation function.

We then applied this simple model to investigate how the clustering of a population of galaxies might change as a function of their observed number density, or, implicitly, as a function of their intrinsic luminosity. Under the simplest assumptions for how model parameters should vary as a function of luminosity cut, the low-S models vary more strongly with number density than do high-S models. None of the models is steep enough to match the trend found by GD01, but models with S = 0.0 − 1.0 are consistent with the change in bias between the samples of Adelberger (2000) and Arnouts et al. (1999), so these data cannot yet provide significant constraints. More data providing smaller error bars on observational parameters, especially the bias, could prove a valuable constraint on the halo occupation relation. We have shown how combinations of our three model parameters are currently constrained by the data, but with the current data sample individual parameters (such as the slope S or the minimum halo mass Mmin) are not well determined. This will probably await a significantly larger survey, such as one that could be completed using, e.g., the Very Large Telescope (VLT) or the Large Binocular Telescope (LBT).

We have discussed here how the formalism presented above can be applied to determine the halo occupation of a specific population of LBGs at z ~ 3, but in fact it is quite general, and could be applied to constrain the halo occupation models for a variety of galaxy populations, and used to understand the relation between various galaxy populations. The limitation of this method is that, as we have shown, it requires a large sample to be able to put strong constraints on model parameters. Moreover, the method relies on the simplifying assumption of scale-independent bias, and in cases where the slope of the galaxy correlation function is significantly different from that of the dark matter, it becomes ill-defined. However, with new generations of telescopes, large samples will become available for an increasing number of types of high redshift galaxies. Once statistics become available from the LALA survey (Rhoads et al. 2000), a similar approach to the one we have presented could be used to understand whether Lyman-α emitters and LBGs populate the same dark haloes, or to relate the LBG population to that of SCUBA sources. Such an analysis would be complementary to the analysis of Shu, Mao & Mo (2001), which uses star formation rates and the observed size distribution to constrain halo occupation models. Similar methods can also be used to relate the LBG population at z = 3 with galaxy populations at different redshifts (Moustakas & Somerville, in preparation). This basic formalism could also easily be applied to quasars identified in the Sloan Digital Sky Survey, as a tool for understanding the halo occupation of quasars as a function of redshift and luminosity.